ML Topics

1. Learning Model (Chapters 2 and 3)

  • Formal model with definitions
  • Statements of the propositions
  • Proof of Corollary 2.3
  • Definition of PAC learnability with 0-1 loss
  • Sample complexity for PAC learnability of finite hypothesis classes with 0-1 loss (see the worked bound after this list)
  • Definition of agnostic PAC learnability for general loss functions
  • Definitions related to empirical and true error for general loss functions, etc.
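
For the sample-complexity bullet above, a standard form of the bound is the following (constants and rounding as in the usual textbook statement; the exact form on the slides may differ):

```latex
% Finite hypothesis class H, realizable PAC learning with 0-1 loss:
% ERM over H succeeds with accuracy eps and confidence 1 - delta whenever
\[
  m \;\ge\; \left\lceil \frac{\log\bigl(|\mathcal{H}|/\delta\bigr)}{\epsilon} \right\rceil .
\]
```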

2. Uniform Convergence (Chapter 4)

  • Definition of ε-representative sample (all details)
  • Lemma 4.2: statement and proof with all details
  • Definition of uniform convergence property: main idea (details of definition not required)
  • Corollary 4.4: main idea (details of statement not required)
  • Corollary 4.6: statement and proof with all details
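
The quantitative content of Corollary 4.6 is the agnostic sample-complexity bound for finite classes; a common form (the exact constants may differ on the slides) is:

```latex
% Finite H has the uniform convergence property, and ERM over H is an
% agnostic PAC learner, with sample complexities
\[
  m^{\mathrm{UC}}_{\mathcal{H}}(\epsilon,\delta)
    \;\le\; \left\lceil \frac{\log\bigl(2|\mathcal{H}|/\delta\bigr)}{2\epsilon^{2}} \right\rceil ,
  \qquad
  m_{\mathcal{H}}(\epsilon,\delta)
    \;\le\; \left\lceil \frac{2\log\bigl(2|\mathcal{H}|/\delta\bigr)}{\epsilon^{2}} \right\rceil .
\]
```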

3. Bias-Complexity Tradeoff (Chapter 5)

  • No Free Lunch (NFL) theorem, NFL and prior knowledge: only main idea
  • Corollary 5.2: statement with all details (no proof)
  • Approximation error, estimation error, complexity and error decomposition: all details
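
A compact reference for the decomposition bullet above (h_S is the learned hypothesis, D the data distribution):

```latex
% Error decomposition: approximation error + estimation error.
\[
  L_{\mathcal{D}}(h_S)
    \;=\; \underbrace{\min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)}_{\epsilon_{\mathrm{app}}}
    \;+\; \underbrace{\Bigl(L_{\mathcal{D}}(h_S) - \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h)\Bigr)}_{\epsilon_{\mathrm{est}}} .
\]
% A richer H lowers the approximation error but tends to raise the
% estimation error: this is the bias-complexity tradeoff.
```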

4. VC-dimension (Chapter 6)

  • Restrictions, shattering, VC-dimension: definitions in detail
  • Fundamental Theorems of Statistical Learning and bound on generalization: in detail, as on the slides (no proof)
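
A compact form of the quantitative generalization bound above (the absolute constants are customarily left unspecified):

```latex
% For VCdim(H) = d < infinity, there are absolute constants C_1, C_2 with
\[
  C_1 \,\frac{d + \log(1/\delta)}{\epsilon^{2}}
  \;\le\; m_{\mathcal{H}}(\epsilon,\delta) \;\le\;
  C_2 \,\frac{d + \log(1/\delta)}{\epsilon^{2}}
\]
% in the agnostic case; the realizable case replaces eps^2 by eps
% (up to log factors).
```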

5. Linear Models (Chapter 9)

  • Linear predictors/models: definitions with all details
  • Linear regression: definitions, matrix form, derivation of the best predictor, use of the generalized inverse in detail (derivation of the generalized inverse: not required)
  • R²: definition and interpretation in detail
  • Linear classification (perceptron): definitions and algorithm in detail (see the sketch after this list)
  • Proposition on perceptron convergence: only main idea (as in slide “Perceptron: Notes”)
  • VC-dimension of linear models: in detail
  • Logistic regression: all details
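
A minimal sketch of the (batch) Perceptron referenced above, assuming labels in {-1, +1} and a bias term absorbed into the features; the data at the bottom is hypothetical:

```python
import numpy as np

def perceptron(X, y, max_iters=1000):
    """Batch Perceptron. X: (m, d) examples, y: (m,) labels in {-1, +1}.

    Returns w with y_i * <w, x_i> > 0 for all i if the data is linearly
    separable (and max_iters is large enough).
    """
    w = np.zeros(X.shape[1])
    for _ in range(max_iters):
        mistakes = np.flatnonzero(y * (X @ w) <= 0)  # misclassified examples
        if mistakes.size == 0:
            return w                     # all examples correctly classified
        i = mistakes[0]
        w = w + y[i] * X[i]              # Perceptron update rule
    return w                             # may not have converged

# hypothetical separable data; last feature column is the constant 1 (bias)
X = np.array([[1.0, 2.0, 1.0], [2.0, 1.0, 1.0],
              [-1.0, -1.5, 1.0], [-2.0, -0.5, 1.0]])
y = np.array([1, 1, -1, -1])
print(perceptron(X, y))
```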

6. Model Selection and Validation (Chapter 11)

  • Validation: all details, including statement of bound on generalization error using validation set (Theorem 11.1, no proof)
  • Validation for model selection: all details, including statement of bound on generalization error using validation set (Theorem 11.2, no proof)
  • Model-selection curve, train-validation-test split, k-fold cross-validation: all details (see the cross-validation sketch after this list)
  • What if learning fails: main idea
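
A minimal k-fold cross-validation sketch for the bullet above; the train_fn/error_fn interface is a hypothetical one chosen for illustration:

```python
import numpy as np

def k_fold_cv(X, y, train_fn, error_fn, k=5, seed=0):
    """Average validation error of a learning rule over k folds.

    train_fn(X_tr, y_tr) -> predictor h (callable: h(X) -> predictions)
    error_fn(y_true, y_pred) -> scalar error on one fold
    """
    m = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(m), k)   # shuffled index folds
    errors = []
    for fold in folds:
        mask = np.ones(m, dtype=bool)
        mask[fold] = False                          # hold out this fold
        h = train_fn(X[mask], y[mask])              # train on the rest
        errors.append(error_fn(y[fold], h(X[fold])))
    return float(np.mean(errors))
```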

7. Regularization and Feature Selection (Chapters 13 and 25)

  • Regularized Loss Minimization: all details
  • ℓ1 regularization, LASSO: all details
  • Tikhonov Regularization, Ridge Regression, derivation of the optimal solution for Ridge Regression: all details (see the closed-form sketch after this list)
  • Subset selection, forward selection, backward selection, without and with validation data: all details
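
A sketch of the ridge-regression closed form from the derivation bullet above; the scaling convention for the regularization parameter (here λ is multiplied by m to match an averaged squared loss) may differ on the slides:

```python
import numpy as np

def ridge_regression(X, y, lam):
    """Solve w = argmin_w  lam * ||w||^2 + (1/m) * ||X w - y||^2.

    Setting the gradient to zero gives (X^T X + lam * m * I) w = X^T y,
    which always has a unique solution for lam > 0.
    """
    m, d = X.shape
    return np.linalg.solve(X.T @ X + lam * m * np.eye(d), X.T @ y)
```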

8. GD, SGD, and SVM (Chapters 14 and 15)

  • Gradient descent (GD): all details
  • Stochastic gradient descent (SGD): all details
  • Hard-SVM optimization problem: all details
  • Hard-SVM: equivalent formulation and quadratic programming formulation: main idea
  • Definition of support vectors for hard-SVM: all details
  • Soft-SVM optimization problem, hinge loss: all details
  • SGD for solving soft-SVM (algorithm): all details (see the sketch after this list)
  • Hard-SVM dual formulation: main idea
  • SVM for regression: all details for the optimization problem and the definition of support vectors only
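
A sketch of SGD for soft-SVM as referenced above, in the common step-size 1/(λt), averaged-output variant; labels are assumed to be in {-1, +1}:

```python
import numpy as np

def sgd_soft_svm(X, y, lam, T, seed=0):
    """SGD for  min_w  lam * ||w||^2 + (1/m) * sum_i hinge(y_i * <w, x_i>)."""
    rng = np.random.default_rng(seed)
    m, d = X.shape
    theta = np.zeros(d)
    w_sum = np.zeros(d)
    for t in range(1, T + 1):
        w = theta / (lam * t)            # current iterate, step size 1/(lam*t)
        i = rng.integers(m)              # pick one example uniformly at random
        if y[i] * (X[i] @ w) < 1:        # hinge loss active: take its subgradient
            theta = theta + y[i] * X[i]
        w_sum += w
    return w_sum / T                     # output the average of the iterates
```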

9. Kernels and SVM (Chapter 16)

  • Definition of kernel: all details
  • Kernel trick for SVM: all details
  • Commonly used kernels for SVM: all details
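
Two commonly used kernels for SVM, plus the Gram matrix that the kernel trick substitutes for explicit feature maps; parameter defaults are illustrative:

```python
import numpy as np

def polynomial_kernel(x, z, degree=3):
    """K(x, z) = (1 + <x, z>)^degree."""
    return (1.0 + np.dot(x, z)) ** degree

def gaussian_kernel(x, z, sigma=1.0):
    """Gaussian (RBF) kernel: K(x, z) = exp(-||x - z||^2 / (2 * sigma^2))."""
    return np.exp(-np.sum((np.asarray(x) - np.asarray(z)) ** 2) / (2.0 * sigma ** 2))

def gram_matrix(X, kernel):
    """G[i, j] = kernel(X[i], X[j]); kernelized SVM only ever needs G."""
    m = len(X)
    return np.array([[kernel(X[i], X[j]) for j in range(m)] for i in range(m)])
```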

10. Neural Networks and Deep Learning (Chapter 20)

  • Neuron, activation function, network architecture, point of view of one node, hypothesis set, matrix notation: all details
  • General construction of NN for a given Boolean formula: all details
  • Expressiveness of NNs: main idea
  • Sample complexity, runtime of learning NNs: main idea
  • Forward propagation algorithm: all details (see the sketch after this list)
  • SGD and Backpropagation algorithm: main idea (pseudocode: only main structure)
  • Regularized NNs: main idea
  • CNNs: differences with standard NNs, convolution properties, convolutional layer, pooling layer, dropout, early stopping, data augmentation, cross entropy loss function: main idea
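
A minimal forward-propagation sketch in matrix notation, as referenced in the list above; the tanh activation and the 2-3-1 architecture at the bottom are illustrative assumptions:

```python
import numpy as np

def forward_propagation(x, layers, activation=np.tanh):
    """Forward pass through a fully connected network.

    layers: list of (W, b) pairs; each hidden layer computes
    a <- activation(W @ a + b), and the output layer is left linear.
    """
    a = x
    for W, b in layers[:-1]:
        a = activation(W @ a + b)        # hidden layer: affine map + nonlinearity
    W, b = layers[-1]
    return W @ a + b                     # linear output layer

# hypothetical 2-3-1 network with random weights
rng = np.random.default_rng(0)
layers = [(rng.standard_normal((3, 2)), rng.standard_normal(3)),
          (rng.standard_normal((1, 3)), rng.standard_normal(1))]
print(forward_propagation(np.array([0.5, -1.0]), layers))
```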

11. Clustering (Chapter 22)

  • Unsupervised learning introduction: all details
  • Clustering definition and difficulties: main idea
  • Model for clustering: all details
  • k-means clustering and Lloyd’s algorithm: all details (see the sketch after this list)
  • Complexity of Lloyd’s algorithm: all details
  • k-means++: all details
  • Linkage-based clustering: all details
  • Silhouette score: all details
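
A minimal sketch of Lloyd’s algorithm for k-means, as referenced above; initialization here is plain uniform sampling of k data points (k-means++ would replace it with distance-weighted seeding):

```python
import numpy as np

def lloyd_kmeans(X, k, max_iters=100, seed=0):
    """Alternate assignment and centroid-update steps until assignments stabilize."""
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]  # random init
    assign = np.full(len(X), -1)
    for _ in range(max_iters):
        # assignment step: each point goes to its nearest centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        new_assign = dists.argmin(axis=1)
        if np.array_equal(new_assign, assign):
            break                         # converged: partition is stable
        assign = new_assign
        # update step: move each centroid to the mean of its cluster
        for j in range(k):
            if np.any(assign == j):
                centroids[j] = X[assign == j].mean(axis=0)
    return centroids, assign
```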